
    One-shot Neural Backdoor Erasing via Adversarial Weight Masking

    Recent studies show that despite achieving high accuracy on many real-world applications, deep neural networks (DNNs) can be backdoored: by injecting triggered data samples into the training dataset, an adversary can mislead the trained model into classifying any test input into the target class whenever the trigger pattern is present. Various methods have been proposed to nullify such backdoor threats. In particular, one line of research aims to purify the potentially compromised model. However, a major limitation of this line of work is its requirement of sufficient original training data: purification performance degrades sharply when the available training data is limited. In this work, we propose Adversarial Weight Masking (AWM), a novel method capable of erasing neural backdoors even in the one-shot setting. The key idea is to formulate backdoor erasing as a min-max optimization problem: first adversarially recover the trigger patterns, then (soft) mask the network weights that are sensitive to the recovered patterns. Comprehensive evaluations on several benchmark datasets suggest that AWM largely improves purification over other state-of-the-art methods across a range of available training dataset sizes.
    Comment: Accepted by NeurIPS 2022 (19 pages, 6 figures, 10 tables)
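    The min-max structure lends itself to a compact sketch. The following is a minimal illustration of the idea only, not the authors' implementation: the soft-mask parametrization, the optimizers, the hyperparameters, and the exact loss terms are our assumptions, and the actual AWM objective differs in its details.

```python
# Minimal sketch of the AWM-style min-max loop (assumptions ours, not the
# authors' code): soft-mask each weight tensor, alternate trigger recovery
# (inner max) with mask learning (outer min) on a tiny clean batch.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.parametrize as parametrize

class SoftMask(nn.Module):
    """Elementwise soft mask in (0, 1) multiplied onto a weight tensor."""
    def __init__(self, shape):
        super().__init__()
        self.logit = nn.Parameter(torch.full(shape, 3.0))  # sigmoid(3) ~ 0.95

    def forward(self, weight):
        return weight * torch.sigmoid(self.logit)

def attach_soft_masks(model):
    """Reparametrize every Linear/Conv2d weight as W * sigmoid(s); freeze W."""
    mask_logits = []
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            mask = SoftMask(m.weight.shape)
            parametrize.register_parametrization(m, "weight", mask)
            m.parametrizations.weight.original.requires_grad_(False)
            mask_logits.append(mask.logit)
    return mask_logits

def awm_purify(model, clean_x, clean_y, steps=100, inner_steps=5, lam=1e-3):
    mask_logits = attach_soft_masks(model)
    trigger = torch.zeros_like(clean_x[:1], requires_grad=True)
    opt_mask = torch.optim.Adam(mask_logits, lr=1e-2)
    opt_trig = torch.optim.Adam([trigger], lr=1e-1)
    for _ in range(steps):
        # Inner maximization: recover a trigger pattern that misleads the
        # currently-masked network on the small clean batch.
        for _ in range(inner_steps):
            opt_trig.zero_grad()
            (-F.cross_entropy(model(clean_x + trigger), clean_y)).backward()
            opt_trig.step()
        # Outer minimization: preserve clean accuracy, neutralize the
        # recovered trigger, and shrink trigger-sensitive masks via an
        # L1-style penalty on the mask values.
        opt_mask.zero_grad()
        loss = (F.cross_entropy(model(clean_x), clean_y)
                + F.cross_entropy(model(clean_x + trigger.detach()), clean_y)
                + lam * sum(torch.sigmoid(s).sum() for s in mask_logits))
        loss.backward()
        opt_mask.step()
    return model
```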

    Testing for a Common Volatility Process and Information Spillovers in Bivariate Financial Time Series Models

    The paper considers the problem of whether financial returns share a common volatility process, in the framework of the stochastic volatility models suggested by Harvey et al. (1994). We propose a stochastic volatility version of the ARCH test of Engle and Susmel (1993), who investigated whether international equity markets have a common volatility process. The paper also tests the hypothesis of frictionless cross-market hedging, which implies perfectly correlated volatility changes, as suggested by Fleming et al. (1998). In deriving the Lagrange Multiplier test statistic, the paper uses the technique of Chesher (1984) for differentiating an integral that contains a degenerate density function.
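    For concreteness, one way to write the setup (in our own notation, which may differ from the paper's parametrization) is a bivariate SV model with latent AR(1) log-volatilities, with the common-volatility hypothesis entering as a restriction:

```latex
% Sketch in our notation; the paper's parametrization may differ.
\begin{align*}
  y_{it} &= \epsilon_{it}\,\exp(h_{it}/2), \qquad i = 1,2,\\
  h_{i,t+1} &= \gamma_i + \phi_i h_{it} + \eta_{it},\\
  H_0 &:\; h_{1t} = h_{2t} \ \text{for all } t
       \quad \text{(a single common volatility process).}
\end{align*}
```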

    Testing for volatility co-movement in bivariate stochastic volatility models

    The paper considers the problem of volatility co-movement, namely whether two financial returns have a perfectly correlated common volatility process, in the framework of multivariate stochastic volatility models, and proposes a test for volatility co-movement. The proposed test is a stochastic volatility version of the co-movement test of Engle and Susmel (1993), who investigated whether international equity markets exhibit volatility co-movement within the ARCH framework. In the empirical analysis we find that volatility co-movement exists among closely linked stock markets, and that volatility co-movement in exchange rate markets tends to appear when the overall volatility level is low. The latter contrasts with the often-cited finding in the financial contagion literature that financial returns co-move in levels during financial crises.
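    To illustrate the null being tested, the toy simulation below (ours, not the paper's code) generates bivariate SV returns driven by a single log-volatility process, so their volatility changes are perfectly correlated by construction:

```python
# Toy simulation (ours) of bivariate SV returns under the co-movement null:
# both series are driven by one latent log-volatility process h_t.
import numpy as np

rng = np.random.default_rng(0)
T, phi, sigma_eta = 2000, 0.95, 0.2

h = np.zeros(T)
for t in range(1, T):
    h[t] = phi * h[t - 1] + sigma_eta * rng.standard_normal()

eps = rng.standard_normal((T, 2))        # idiosyncratic return shocks
y = np.exp(h[:, None] / 2.0) * eps       # shared volatility exp(h_t / 2)

# log y_it^2 = h_t + log eps_it^2: both series contain the common component
# h_t, so their log-squared returns correlate positively under H0 (the
# correlation stays below 1 only because of the idiosyncratic eps noise).
log_sq = np.log(y ** 2 + 1e-12)
print(np.corrcoef(log_sq[:, 0], log_sq[:, 1])[0, 1])
```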

    Do Language Models Plagiarize?

    Past literature has illustrated that language models (LMs) often memorize parts of training instances and reproduce them during natural language generation (NLG). However, it is unclear to what extent LMs "reuse" a training corpus: for instance, models can generate paraphrased sentences that are contextually similar to training samples. In this work we therefore study three types of plagiarism (verbatim, paraphrase, and idea) in GPT-2 generated texts relative to its training data, and further analyze the plagiarism patterns of LMs fine-tuned on domain-specific corpora, a setting widely used in practice. Our results suggest that (1) all three types of plagiarism occur widely in LMs beyond memorization, (2) both the size and the decoding methods of LMs are strongly associated with the degree of plagiarism they exhibit, and (3) fine-tuned LMs' plagiarism patterns vary with corpus similarity and homogeneity. Given that a majority of LMs' training data is scraped from the Web without informing content owners, their reiteration of words, phrases, and even core ideas from training sets into generated texts has ethical implications. These patterns are likely to worsen as both the size of LMs and their training data increase, raising concerns about indiscriminately pursuing ever-larger models with ever-larger training corpora. Plagiarized content can also contain individuals' personal and sensitive information. Overall, these findings cast doubt on the practicality of current LMs in mission-critical writing tasks and call for more discussion of the observed phenomena. Data and source code are available at https://github.com/Brit7777/LM-plagiarism.
    Comment: Accepted to WWW'2
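    For intuition, verbatim reuse, the simplest of the three types, can be screened with exact n-gram overlap. The sketch below is ours, not the paper's detection pipeline (whose thresholds differ, and which additionally handles paraphrase and idea plagiarism via semantic matching):

```python
# Toy verbatim-reuse screen (ours): flag generated word n-grams that also
# appear in the training corpus.
def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def verbatim_overlap(generated, corpus_docs, n=8):
    """Return the generated n-grams found verbatim in any corpus document."""
    gen_grams = ngrams(generated.lower().split(), n)
    corpus_grams = set()
    for doc in corpus_docs:
        corpus_grams |= ngrams(doc.lower().split(), n)
    return gen_grams & corpus_grams

corpus = ["the quick brown fox jumps over the lazy dog every morning"]
generated = "she said the quick brown fox jumps over the lazy dog today"
print(verbatim_overlap(generated, corpus))  # two overlapping 8-grams
```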

    Unbiased Math Word Problems Benchmark for Mitigating Solving Bias

    In this paper, we revisit solving bias in the evaluation of models on current Math Word Problem (MWP) benchmarks. Current solvers suffer from solving bias, which consists of data bias and learning bias arising from biased datasets and improper training strategies. Our experiments verify that MWP solvers are easily biased by training datasets that do not cover diverse questions for each problem narrative, so a solver learns shallow heuristics rather than the deep semantics needed to understand problems. Moreover, an MWP can naturally be solved by multiple equivalent equations, while current datasets take only one of them as the ground truth, forcing the model to match the labeled equation and ignore the other equivalent ones. We first introduce a novel MWP dataset named UnbiasedMWP, constructed by varying the grounded expressions in our collected data and manually annotating them with corresponding new questions. Then, to further mitigate learning bias, we propose a Dynamic Target Selection (DTS) strategy that dynamically selects more suitable target expressions during training, according to the longest prefix match between the current model output and the candidate equivalent equations obtained by applying the commutative law; a sketch of this selection step follows below. The results show that UnbiasedMWP has significantly fewer biases than its original data and other datasets, making it a promising benchmark for fairly evaluating solvers' reasoning skills rather than their ability to match nearest neighbors. Solvers trained with our DTS achieve higher accuracy on multiple MWP benchmarks. The source code is available at https://github.com/yangzhch6/UnbiasedMWP.
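    The selection step is easy to sketch. Below is a toy illustration (ours, not the released code): a naive commutative-variant generator that only handles flat expressions, plus longest-prefix target selection; the real system rewrites full expression trees.

```python
# Toy Dynamic Target Selection (ours): among equivalent target equations
# (here generated by permuting the top-level operands of '+' or '*'), pick
# the one sharing the longest prefix with the model's current output.
from itertools import permutations

def commutative_variants(expr):
    """Naive variant generator for flat expressions like 'a+b+c'.
    A real MWP target needs a proper expression-tree rewrite."""
    for op in ("+", "*"):
        parts = expr.split(op)
        if len(parts) > 1:
            return [op.join(p) for p in set(permutations(parts))]
    return [expr]

def longest_prefix_len(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def select_target(model_output, gold):
    """Pick the equivalent equation closest (by prefix) to the model output."""
    return max(commutative_variants(gold),
               key=lambda v: longest_prefix_len(model_output, v))

# If the model starts emitting 'y*x', training against 'y*x' rather than the
# labeled 'x*y' avoids penalizing an equivalent solution.
print(select_target("y*x", "x*y"))  # -> 'y*x'
```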